[docs] Performance docs tidy up, part 1 #23963
Thanks for your PR. I don't think that removing half the guide without putting it elsewhere is the way to go. Maybe you want to write a higher-level guide and leave this one as more advanced, but in general, I don't think we need less doc. The examples showing the benchmarking are interesting to show the result of one technique or another. The anatomy of a model section is important to understand what is at play.
Also cc @lvwerra and @stas00 who contributed the original guide.
## Anatomy of Model's Operations
However, if the preferred batch size fits into memory, there's no reason to apply memory-optimizing techniques because they can
slow down the training. Just because one can use a large batch size, does not necessarily mean they should. As part of
I don't see any situation where one would want to use a lower batch size, so I would remove these two sentences.
This is from the original doc: "If the desired batch size fits into memory then there is no reason to apply gradient accumulation which will only slow down training. [...] Just because we can does not mean we should use a large batch size."
At some point increasing the batch size will slow down convergence or even lead to worse performance overall, no? Otherwise you would increase GradAcc indefinitely.
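To make the trade-off in this thread concrete, here is a minimal pure-Python sketch (not taken from the guide) of what gradient accumulation computes. It assumes a toy one-parameter model with a mean squared error loss; the names `grad_mse`, `data`, and `accumulation_steps` are illustrative only. Accumulating scaled micro-batch gradients gives the same gradient as one large batch, but it needs several forward/backward passes per optimizer step, which is why it is only worth using when the desired batch size does not fit into memory.

```python
# Pure-Python sketch: accumulating gradients over micro-batches reproduces
# the gradient of one large batch when the loss is a mean over samples.

def grad_mse(w, batch):
    """Gradient of mean((w * x - y) ** 2) with respect to w over a batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

# Hypothetical toy data: (input, target) pairs.
data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]
w = 0.5

# One backward pass over the full batch of 4 samples.
full_grad = grad_mse(w, data)

# Same effective batch via 2 accumulation steps with micro-batch size 2.
accumulation_steps = 2
micro_batches = [data[0:2], data[2:4]]
accumulated = 0.0
for micro_batch in micro_batches:
    # Scale each micro-batch gradient so the sum is the mean over all samples.
    accumulated += grad_mse(w, micro_batch) / accumulation_steps

effective_batch_size = len(micro_batches[0]) * accumulation_steps
print(full_grad, accumulated, effective_batch_size)
```

The sketch only shows that the gradients match. Whether a larger effective batch size helps or hurts convergence, as raised above, still has to be checked empirically.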
| [Optimizer choice](#optimizer-choice) | Yes | Yes |
| [Data preloading](#data-preloading) | Yes | No |
| [DeepSpeed Zero](#deepspeed-zero) | No | Yes |
| [torch.compile](#using-torchcompile) | Yes | No |
Nit: not aligned (since the rest of the table is).
Model weights are not the only thing that is stored in memory during the training process. Other components consuming GPU
memory are optimizer states, gradients, forward activations saved for gradient computation, temporary buffers, and
functionality-specific memory. By reducing the memory footprint of some of these components, you can optimize overall GPU
memory usage.
This paragraph is general and not linked to gradient accumulation, so it should be put before.
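As a rough illustration of the components listed in the quoted paragraph, the sketch below estimates the memory that scales with parameter count. It assumes plain fp32 training with AdamW: 4 bytes per parameter for weights, 4 for gradients, and 8 for the two optimizer moments. The function name `training_memory_gib` is hypothetical, and activations, temporary buffers, and functionality-specific memory are deliberately left out because they depend on batch size and architecture.

```python
def training_memory_gib(num_params, bytes_weights=4, bytes_grads=4, bytes_optimizer=8):
    """Estimate parameter-proportional training memory in GiB.

    Assumed defaults: fp32 weights (4 bytes), fp32 gradients (4 bytes),
    and AdamW state (two fp32 moments, 8 bytes) per parameter.
    Activations and temporary buffers are NOT included.
    """
    total_bytes = num_params * (bytes_weights + bytes_grads + bytes_optimizer)
    return total_bytes / 2**30

# A hypothetical 1B-parameter model needs about 14.9 GiB before any activations.
one_billion = training_memory_gib(1_000_000_000)
print(f"{one_billion:.1f} GiB")
```

Changing a single argument shows how one technique shifts the total: a smaller `bytes_optimizer` models a lighter optimizer, for example.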
The main speed improvement of mixed precision training comes from saving the activations in half precision (fp16 (float16)).
This is incorrect. The main speed improvement comes from the use of Tensorcores on modern GPUs.
I suppose I interpreted this sentence from the original doc incorrectly: "The main advantage comes from saving the activations in half (16-bit) precision."
## Anatomy of Model's Memory
I don't think it's a good idea to completely remove this part of the guide.
@sgugger The main feedback to these parts of the documentation was that it was difficult to find actionable information. It was a large document that was hard to navigate (it is still large, but in my opinion, more actionable). Less doc is precisely what users wanted. Perhaps, the anatomy of the model can move to the conceptual guides, and we can link it here. As for the benchmarking, here's my reasoning for removing the examples - a) the benchmarks will differ depending on a number of factors, and one example does not guarantee the same results for all scenarios, b) there are links to benchmarks at the end of some sections. Perhaps, instead of this exact example throughout all the doc, we could add a separate doc about benchmarking (how to) so that users could do that on their own.
Thank you for adding @lvwerra and @stas00, would love to hear their thoughts!
I don't mind moving it but I feel like it contains useful information that should be somewhere in the docs.
Generally, I'm happy with making things more concise. Maybe there is a compromise where complexity is added as you read on, so casual users find important information at the beginning and more detailed info comes later. I also left a few comments :)
Thanks for slimming this down; the new overview page is quite nice! 👏
I think if we take the content removed from here and put it in a conceptual guide or even a blog post, that'll ensure we don't lose any of the information in the original doc. We can add a link to the removed content so users who want to learn more can check it out.
@@ -9,489 +9,257 @@ Unless required by applicable law or agreed to in writing, software distributed
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
-->

-# Efficient Training on a Single GPU
+# Methods and tools for efficient training on a single GPU
I'd also update the title in the `toctree` so it matches this.
`torch.compile` has a growing list of backends, which can be found in [backends.py](https://github.com/pytorch/pytorch/blob/master/torch/_dynamo/optimizations/backends.py)
or `torchdynamo.list_backends()` each of which with its optional dependencies.
The link to `backends.py` leads to a 404, and it'd also be nice to add a link to `torchdynamo.list_backends()`.
Suggested change:
- `torch.compile` has a growing list of backends, which can be found in [backends.py](https://github.com/pytorch/pytorch/blob/master/torch/_dynamo/optimizations/backends.py)
- or `torchdynamo.list_backends()` each of which with its optional dependencies.
+ `torch.compile` has a growing list of backends, which can be found in [backends.py](https://github.com/pytorch/pytorch/blob/master/torch/_dynamo/optimizations/backends.py)
+ or `torchdynamo.list_backends()`, each with their optional dependencies.
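Since the `backends.py` link is reported as dead, one way to check which backends are available is to ask the installed PyTorch directly. The sketch below assumes a PyTorch 2.x install, where the helper is exposed as `torch._dynamo.list_backends()` rather than under the `torchdynamo` name quoted above. The wrapper `available_compile_backends` is a hypothetical name, and it returns an empty list when PyTorch (or its dynamo module) is not installed.

```python
def available_compile_backends():
    """Return the torch.compile backend names known to the local install.

    Assumption: PyTorch 2.x, which exposes torch._dynamo.list_backends().
    Returns an empty list if the module cannot be imported.
    """
    try:
        import torch._dynamo as dynamo
    except ImportError:
        return []
    return list(dynamo.list_backends())

print(available_compile_backends())
```

The returned names can vary between PyTorch versions and with the optional dependencies installed, so the output is intentionally not shown here.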
**Debugging backends**:
It may be easier to read if we throw all this in a table of backend types.
Thanks for iterating on this! Just left one last comment as the lack of memory saving for mixed precision training when using large models is an issue we see pretty often.
docs/source/en/perf_train_gpu_one.md
Outdated
**Functionality-specific memory**
The parentheses mean there may be negligible effects on memory utilization.
Since the parentheses only apply to the mixed precision case, I would be fully explicit here: with a small model and a large batch size, there will be some memory saving (not negligible), but with a large model and a small batch size, the memory use will be larger.
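The two cases in this comment can be sketched with simple arithmetic. The model below is an idealization, not a measurement. It assumes mixed precision keeps the fp32 weights and adds an fp16 copy (2 extra bytes per parameter), and that activation memory is roughly halved. The function name and the example sizes are hypothetical.

```python
def mixed_precision_delta_bytes(num_params, fp32_activation_bytes):
    """Change in memory when switching from fp32 to mixed precision.

    Assumptions (idealized): an extra fp16 copy of the weights costs
    2 bytes per parameter, and activations shrink to half their fp32 size.
    A positive result means mixed precision uses MORE memory.
    """
    extra_weight_bytes = 2 * num_params
    saved_activation_bytes = fp32_activation_bytes / 2
    return extra_weight_bytes - saved_activation_bytes

GIB = 2**30

# Small model, large batch: activations dominate, so memory goes down.
small_model = mixed_precision_delta_bytes(100_000_000, 8 * GIB)

# Large model, small batch: the extra weight copy dominates, so memory goes up.
large_model = mixed_precision_delta_bytes(3_000_000_000, 1 * GIB)

print(small_model / GIB, large_model / GIB)
```

Under these assumptions the break-even point is where fp32 activations take four times as many bytes as there are parameters. Real workloads will deviate from this, so profiling is still needed.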
Would you like to take another pass on this PR, @lvwerra?
Looks good, thanks!
haewon <[email protected]> * Fix `PvtModelIntegrationTest::test_inference_fp16` (#25106) update Co-authored-by: ydshieh <[email protected]> * Add descriptive docstring to TemperatureLogitsWarper (#24892) * Add descriptive docstring to TemperatureLogitsWarper It addresses https://github.com/huggingface/transformers/issues/24783 * Remove niche features Co-authored-by: Joao Gante <[email protected]> * Commit suggestion Co-authored-by: Joao Gante <[email protected]> * Refactor the examples to simpler ones * Add a missing comma Co-authored-by: Joao Gante <[email protected]> * Make args description more compact Co-authored-by: Joao Gante <[email protected]> * Remove extra text after making description more compact Co-authored-by: Joao Gante <[email protected]> * Fix linter --------- Co-authored-by: Joao Gante <[email protected]> * fix "UserWarning: Creating a tensor from a list of numpy.ndarrays is … (#24772) fix "UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor." Co-authored-by: 刘长伟 <[email protected]> * update `use_auth_token` -> `token` (#25083) * update --------- Co-authored-by: ydshieh <[email protected]> * Fix past CI after #24334 (#25113) update Co-authored-by: ydshieh <[email protected]> * Move common image processing methods to BaseImageProcessor (#25089) Move out common methods * Fix ViT docstring regarding default dropout values. (#25118) Fix docstring for dropout. 
* MaskFormer - enable return_dict in order to compile (#25052) * Enable return_dict in order to compile * Update tests * Move center_crop to BaseImageProcessor (#25122) * fix deepspeed load best model at end when the model gets sharded (#25057) * fix delete all checkpoints when save_total_limit is set to 1 (#25136) * [`T5/LlamaTokenizer`] default legacy to `None` to not always warn (#25131) default legacy to None * Clarify 4/8 bit loading log message (#25134) * clarify 4/8 bit loading log message * make style * 🚨🚨🚨Change default from `adamw_hf` to `adamw_torch` 🚨🚨🚨 (#25109) * Change defaults * Sylvain's comments * [`MptConfig`] support from pretrained args (#25116) * support from pretrained args * draft addition of tests * update test * use parrent assert true * Update src/transformers/models/mpt/configuration_mpt.py Co-authored-by: Younes Belkada <[email protected]> --------- Co-authored-by: Younes Belkada <[email protected]> * Add offload support to Bark (#25037) * initial Bark offload proposal * use hooks instead of manually offloading * add test of bark offload to cpu feature * Apply nit suggestions from code review Co-authored-by: Sylvain Gugger <[email protected]> * Update docstrings of offload Co-authored-by: Sanchit Gandhi <[email protected]> * remove unecessary set_seed in Bark tests --------- Co-authored-by: Sylvain Gugger <[email protected]> Co-authored-by: Sanchit Gandhi <[email protected]> * More `token` things (#25146) * fix * fix * fix * fix --------- Co-authored-by: ydshieh <[email protected]> * Add bloom flax (#25094) * First commit * step 1 working * add alibi * placeholder for `scan` * add matrix mult alibi * beta scaling factor for bmm * working v1 - simple forward pass * move layer_number from attribute to arg in call * partial functioning scan * hacky working scan * add more modifs * add test * update scan for new kwarg order * fix position_ids problem * fix bug in attention layer * small fix - do the alibi broadcasting only once * prelim 
refactor * finish refactor * alibi shifting * incorporate dropout_add to attention module * make style * make padding work again * update * remove bogus file * up * get generation to work * clean code a bit * added small tests * adding albii test * make CI tests pass: - change init weight - add correct tuple for output attention - add scan test - make CI tests work * fix few nits * fix nit onnx * fix onnx nit * add missing dtype args to nn.Modules * remove debugging statements * fix scan generate * Update modeling_flax_bloom.py * Update test_modeling_flax_bloom.py * Update test_modeling_flax_bloom.py * Update test_modeling_flax_bloom.py * fix small test issue + make style * clean up * Update tests/models/bloom/test_modeling_flax_bloom.py Co-authored-by: Sanchit Gandhi <[email protected]> * fix function name * small fix test * forward contrib credits from PR17761 * Fix failing test * fix small typo documentation * fix non passing test - remove device from build alibi * refactor call - refactor `FlaxBloomBlockCollection` module * make style * upcast to fp32 * cleaner way to upcast * remove unused args * remove layer number * fix scan test * make style * fix i4 casting * fix slow test * Update src/transformers/models/bloom/modeling_flax_bloom.py Co-authored-by: Sanchit Gandhi <[email protected]> * remove `layer_past` * refactor a bit * fix `scan` slow test * remove useless import * major changes - remove unused code - refactor a bit - revert import `torch` * major refactoring - change build alibi * remove scan * fix tests * make style * clean-up alibi * add integration tests * up * fix batch norm conversion * style * style * update pt-fx cross tests * update copyright * Update src/transformers/modeling_flax_pytorch_utils.py Co-authored-by: Sylvain Gugger <[email protected]> * per-weight check * style * line formats --------- Co-authored-by: younesbelkada <[email protected]> Co-authored-by: Patrick von Platen <[email protected]> Co-authored-by: Younes Belkada <[email 
protected]> Co-authored-by: haileyschoelkopf <[email protected]> Co-authored-by: Sylvain Gugger <[email protected]> * Add new model in doc table of content (#25148) * Fix `.push_to_hub` and cleanup `get_full_repo_name` usage (#25120) * Fix .push_to_hub and cleanup get_full_repo_name usage * Do not rely on Python bool conversion magic * request changes * Add test when downloading from gated repo (#25039) * override .cuda() to check if model is already quantized (#25166) * Represent query_length in a different way to solve jit issue (#25164) Fix jit trace * make run_generation more generic for other devices (#25133) * make run_generation more generic for other devices * use Accelerate to support any device type it supports. * make style * fix error usage of accelerator.prepare_model * use `PartialState` to make sure everything is running on the right device --------- Co-authored-by: statelesshz <[email protected]> * added compiled model support for inference (#25124) * added compiled model support for inference * linter * Fix tests * linter * linter * remove inference mode from pipelines * Linter --------- Co-authored-by: amarkov <[email protected]> * Update `use_auth_token` -> `token` in example scripts (#25167) * pytorch examples * tensorflow examples * flax examples --------- Co-authored-by: ydshieh <[email protected]> * [`Mpt`] Fix mpt slow test (#25170) fix mpt slow test * [`InstructBlip`] Fix instructblip slow test (#25171) * fix instruct blip slow test * Update tests/models/instructblip/test_modeling_instructblip.py * 🌐 [i18n-KO] Translated `transformers_agents.md` to Korean (#24881) * docs: ko: transformers_agents.md * docs: ko: transformers_agents.md * feat: deepl draft * fix: manual edits * fix: resolve suggestions Co-authored-by: Juntae <[email protected]> Co-authored-by: Injin Paek <[email protected]> --------- Co-authored-by: Juntae <[email protected]> Co-authored-by: Injin Paek <[email protected]> * Fix beam search to sample at least 1 non eos token 
(#25103) (#25115) * [MusicGen] Fix integration tests (#25169) * move to device * update with cuda values * fix fp16 * more rigorous * 🚨🚨🚨 Fix rescale ViVit Efficientnet (#25174) * Fix rescaling bug * Add tests * Update integration tests * Fix up * Update src/transformers/image_transforms.py * Update test - new possible order in list * Musicgen: CFG is manually added (#25173) * Better error message in `_prepare_output_docstrings` (#25202) fix Co-authored-by: ydshieh <[email protected]> * [`PreTrainedModel`] Wrap `cuda` and `to` method correctly (#25206) wrap `cuda` and `to` method correctly * Fix `all_model_classes` in `FlaxBloomGenerationTest` (#25211) fix Co-authored-by: ydshieh <[email protected]> * [quantization.md] fix (#25190) Update quantization.md * [`pipeline`] revisit device check for pipeline (#25207) * revisit device check for pipeline * let's raise an error. * Update tiny model info. and pipeline testing (#25213) * update tiny_model_summary.json * update * update * update --------- Co-authored-by: ydshieh <[email protected]> * Fix docker image build failure (#25214) fix Co-authored-by: ydshieh <[email protected]> * make build_mpt_alibi_tensor a method of MptModel so that deepspeed co… (#25193) make build_mpt_alibi_tensor a method of MptModel so that deepspeed could override it to make autoTP work Signed-off-by: Wang, Yi A <[email protected]> * [`Pix2Struct`] Fix pix2struct cross attention (#25200) * fix pix2struct cross attention * fix torchscript slow test * [`Docs`/`quantization`] Clearer explanation on how things works under the hood. + remove outdated info (#25216) * clearer explanation on how things works under the hood. 
* Update docs/source/en/main_classes/quantization.md Co-authored-by: Steven Liu <[email protected]> * Update docs/source/en/main_classes/quantization.md Co-authored-by: amyeroberts <[email protected]> * add `load_in_4bit` in `from_pretrained` --------- Co-authored-by: Steven Liu <[email protected]> Co-authored-by: amyeroberts <[email protected]> * [`MPT`] Add `require_bitsandbytes` on MPT integration tests (#25201) * add `require_bitsandbytes` on MPT integration tests * add it on mpt as well * [`Detr`] Fix detr BatchNorm replacement issue (#25230) * fix detr weird issue * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: Sylvain Gugger <[email protected]> * fix copies * fix copies --------- Co-authored-by: Sylvain Gugger <[email protected]> * Move rescale dtype recasting to match torchvision ToTensor (#25229) Move dtype recasting to match torchvision ToTensor * Fix set of model parallel in the Trainer when no GPUs are available (#25239) * fix get_keys_to_not_convert() to return correct modules for full precision inference (#25105) * add test for `get_keys_to_not_convert` * add minimum patch to keep mpt lm_head from 8bit quantization * add reivsion to * add pathname and line number to logging formatter in debug mode (#25203) * add pathname and lineno to logging formatter in debug mode * use TRANSFORMERS_VERBOSITY="detail" to print pathname and lineno * Add `token` arugment in example scripts (#25172) * fix * fix * fix * fix * fix * fix * fix --------- Co-authored-by: ydshieh <[email protected]> * resolving zero3 init when using accelerate config with Trainer (#25227) * resolving zero3 init when using accelerate config with Trainer * refactor * fix * fix import * Update rescale tests - cast to float after rescaling to reflect #25229 (#25259) Rescale tests - cast to float after rescaling to reflect #25229 * Fix some bugs for two stage training of deformable detr (#25045) * Update modeling_deformable_detr.py Fix bugs for two 
stage training * Update modeling_deformable_detr.py * Add test_two_stage_training to DeformableDetrModelTest --------- Co-authored-by: yupeng.jia <[email protected]> * [DOCS] Add example and modified docs of EtaLogitsWarper (#25125) * added example and modified docs for EtaLogitsWarper * make style * fixed styling issue on 544 * removed error info and added set_seed * Update src/transformers/generation/logits_process.py Co-authored-by: amyeroberts <[email protected]> * Update src/transformers/generation/logits_process.py Co-authored-by: amyeroberts <[email protected]> * updated the results --------- Co-authored-by: amyeroberts <[email protected]> * Fix return_dict_in_generate bug in InstructBlip generate function (#25246) Fix bug in InstructBlip generate function Previously, the postprocessing conducted on generated sequences in InstructBlip's generate function assumed these sequences were tensors (i.e. that `return_dict_in_generate == False`). This commit checks whether the result of the call to the wrapped language model `generate()` is a tensor, and if not attempts to postprocess the sequence attribute of the returned results object. 
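The InstructBlip fix described above (#25246) comes down to branching on the type of the value returned by the wrapped `generate()` call. A minimal sketch of that pattern follows; it is not the library's actual code. `FakeGenerateOutput`, `prepend_bos`, and `postprocess` are hypothetical names, plain Python lists stand in for tensors, and prepending a BOS id is only an illustrative postprocessing step.

```python
from dataclasses import dataclass
from typing import List, Union

# Stand-in for a batch of token ids; the real code operates on torch tensors.
TokenBatch = List[List[int]]


@dataclass
class FakeGenerateOutput:
    """Hypothetical stand-in for the results object returned when
    return_dict_in_generate=True; only the `sequences` attribute is modeled."""

    sequences: TokenBatch


def prepend_bos(sequences: TokenBatch, bos_token_id: int) -> TokenBatch:
    """Illustrative postprocessing: put a BOS id in front of every sequence."""
    return [[bos_token_id] + seq for seq in sequences]


def postprocess(
    outputs: Union[TokenBatch, FakeGenerateOutput], bos_token_id: int = 1
) -> Union[TokenBatch, FakeGenerateOutput]:
    """Apply the postprocessing whether `outputs` is the raw batch or a results object."""
    if isinstance(outputs, list):
        # Raw sequences (the return_dict_in_generate == False case).
        return prepend_bos(outputs, bos_token_id)
    # Results object: postprocess its `sequences` attribute instead.
    outputs.sequences = prepend_bos(outputs.sequences, bos_token_id)
    return outputs
```

With this shape, callers get the same postprocessing on either return type, and a results object keeps any other fields it carries.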
* Remove `pytest_options={"rA": None}` in CI (#25263) fix Co-authored-by: ydshieh <[email protected]> * 🌐 [i18n-KO] Translated `perf_infer_gpu_many.md` to Korean (#24943) * doc: ko: perf_infer_gpu_many.mdx * feat: chatgpt draft * fix: manual edits * Update docs/source/ko/perf_infer_gpu_many.md Co-authored-by: Jungnerd <[email protected]> --------- Co-authored-by: Jungnerd <[email protected]> * recommend DeepSpeed's Argument Parsing documentation (#25268) * [MMS] Fix mms (#25267) * [MMS] Fix mms * [MMS] Fix mms * fix mms loading * Apply suggestions from code review * make style * Update tests/models/wav2vec2/test_modeling_wav2vec2.py * CI with `num_hidden_layers=2` 🚀🚀🚀 (#25266) * CI with layers=2 --------- Co-authored-by: ydshieh <[email protected]> * CI with `pytest_num_workers=8` for torch/tf jobs (#25274) n8 Co-authored-by: ydshieh <[email protected]> * Docs: Update list of `report_to` logging integrations in docstring (#25281) * Update list of logging integrations in docstring Also update type hint * Also add 'flyte' to report_to callback list * Revert 'report_to' type hint update Due to CLI breaking * Update InstructBLIP & Align values after rescale update (#25209) * Update InstructBLIP values Note: the tests are not independent. Running the test independentely produces different logits compared to running all the integration tests * Update test values after rescale update * Remove left over commented out code * Revert to previous rescaling logic * Update rescale tests * Docs: separate generate section (#25235) Separate generate doc section * Update bark doc (#25234) * add mention to optimization in Bark docs * add offload mention in docs * Apply suggestions from code review Co-authored-by: Sanchit Gandhi <[email protected]> * Update bark docs. 
* Update bark.md --------- Co-authored-by: Sanchit Gandhi <[email protected]> * add generate method to SpeechT5ForTextToSpeech (#25233) * add generate method to SpeechT5ForTextToSpeech * update speecht5forTTS docstrings * Remove defaults to None in generate docstrings Co-authored-by: Sylvain Gugger <[email protected]> --------- Co-authored-by: Sylvain Gugger <[email protected]> * Add timeout parameter to load_image function (#25184) * Add timeout parameter to load_image function. * Remove line. * Reformat code Co-authored-by: amyeroberts <[email protected]> * Add parameter to docs. --------- Co-authored-by: amyeroberts <[email protected]> * [JAX] Bump min version (#25286) * [JAX] Bump min version * make fixup * [small] llama2.md typo (#25295) `groupe` -> `grouped` * Fix typo: Roberta -> RoBERTa (#25302) * Move usage of deprecated logging.warn to logging.warning (#25310) The former spelling is deprecated and has been discouraged for a while. The latter spelling seems to be more common in this project anyway, so this change ought to be safe. 
Fixes https://github.com/huggingface/transformers/issues/25283 * Give more memory in test_disk_offload (#25315) * Generate: get generation mode as an enum (#25292) * Add offline mode for agents (#25226) * Add offline mode for agents * Disable second check too * Deal with nested configs better in base class (#25237) * Deal better with nested configs * Fixes * More fixes * Fix last test * Clean up existing configs * Remove hack in MPT Config * Update src/transformers/configuration_utils.py Co-authored-by: Younes Belkada <[email protected]> * Fix setting a nested config via dict in the kwargs * Adapt common test * Add test for nested config load with dict --------- Co-authored-by: Younes Belkada <[email protected]> * Document check copies (#25291) * Document check copies better and add tests * Include header in check for copies * Manual fixes * Try autofix * Fixes * Clean tests * Finalize doc * Remove debug print * More fixes * Make `bark` could have tiny model (#25290) * temp * update * update * update * small dim * small dim * small dim * fix * update * fix * fix * fix * fix * fix * fix * fix --------- Co-authored-by: ydshieh <[email protected]> * Document toc check and doctest check scripts (#25319) * Clean doc toc check and make doctest list better * Add to Makefile * [Whisper] Better error message for outdated generation config (#25298) * Remove jnp.DeviceArray since it is deprecated. (#24875) * Remove jnp.DeviceArray since it is deprecated. 
* Replace all instances of jnp.DeviceArray with jax.Array * Update src/transformers/models/bert/modeling_flax_bert.py --------- Co-authored-by: Sanchit Gandhi <[email protected]> * add CFG for .generate() (#24654) * 🌐 [i18n-KO] Translated `perf_infer_gpu_one.md` to Korean (#24978) * docs: ko: perf_infer_gpu_one * feat: chatgpt draft * fix: manual edits * fix: manual edits * fix: resolve suggestions Co-authored-by: Sohyun Sim <[email protected]> Co-authored-by: TaeYupNoh <[email protected]> * fix: resolve suggestions * fix: resolve suggestions Co-authored-by: Younes Belkada <[email protected]> --------- Co-authored-by: Sohyun Sim <[email protected]> Co-authored-by: TaeYupNoh <[email protected]> Co-authored-by: Younes Belkada <[email protected]> * Update TF pin in docker image (#25343) fix Co-authored-by: ydshieh <[email protected]> * Generalize CFG to allow for positive prompts (#25339) * Generalize CFG to allow for positive prompts * Add documentation, fix the correct class * Loosen output shape restrictions on GPT-style models (#25188) * Loosen output shape restrictions on GPT-style models * Use more self-explanatory variables * Revert "Use more self-explanatory variables" This reverts commit 5fd9ab39119558b7e750f61aa4a19014dccc5ed5. * Allow `trust_remote_code` in example scripts (#25248) * pytorch examples * pytorch mim no trainer * cookiecutter * flax examples * missed line in pytorch run_glue * tensorflow examples * tensorflow run_clip * tensorflow run_mlm * tensorflow run_ner * tensorflow run_clm * pytorch example from_configs * pytorch no trainer examples * Revert "tensorflow run_clip" This reverts commit 261f86ac1f1c9e05dd3fd0291e1a1f8e573781d5. 
* fix: duplicated argument * Generate: remove Marian hack (#25294) Remove Marian hack * Fix more offload edge cases (#25342) * fix * fix * fix --------- Co-authored-by: ydshieh <[email protected]> * Migrate Trainer from `Repository` to `upload_folder` (#25095) * First draft * Deal with progress bars * Update src/transformers/utils/hub.py Co-authored-by: Lucain <[email protected]> * Address review comments * Forgot one * Pin hf_hub * Add argument for push all and fix tests * Fix tests * Address review comments --------- Co-authored-by: Lucain <[email protected]> * Adding more information in help parser on train_file and validation_file (#25324) chorse: adding new doc on train and val * [DOCS] Add `NoRepeatNGramLogitsProcessor` Example for `LogitsProcessor` class (#25186) * Add Description And Example to Docstring * make style corrections * make style * Doc Style Consistent With HF * Apply make style * Modify Docstring * Edit Type in Docstring * Feedback Incorporated * Edit Docstring * make style * Post Review Changes * Review Feedback Incorporated * Styling * Formatting * make style * pep8 * Docs: Added benchmarks for `torch.compile()` for vision models (#24748) * added benchmarks for compile * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <[email protected]> * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <[email protected]> * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <[email protected]> * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <[email protected]> * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <[email protected]> * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <[email protected]> * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <[email protected]> * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <[email protected]> * Update 
docs/source/en/perf_torch_compile.md Co-authored-by: Sayak Paul <[email protected]> * Update docs/source/en/perf_torch_compile.md Co-authored-by: amyeroberts <[email protected]> * Update docs/source/en/perf_torch_compile.md Co-authored-by: amyeroberts <[email protected]> * added more models * added more models fr * added visualizations * minor fix * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <[email protected]> * Update docs/source/en/perf_torch_compile.md Co-authored-by: amyeroberts <[email protected]> * Update docs/source/en/perf_torch_compile.md Co-authored-by: amyeroberts <[email protected]> * Added links to models and put charts side by side * Added batch comparisons * Added more comparisons * Fix table * Added link to wheel * Update perf_torch_compile.md --------- Co-authored-by: Steven Liu <[email protected]> Co-authored-by: Sayak Paul <[email protected]> Co-authored-by: amyeroberts <[email protected]> * Add mask2former fp16 support (#25093) * Add mask2former fp16 support * Clear consistency/quality issues * Fix consistency/quality (2) * Add integration test for mask2former (fp16 case) * Fix code quality * Add integration test for maskformer (fp16 case) * Add integration test for oneformer (fp16 case) * Remove slow decorator from fp16 tests * Fix lint * Remove usage of full inference and value checks for fp16 * Temporarily comment slow for {mask, mask2, one}former * Add fp16 support to oneformer * Revert "Temporarily comment slow for {mask, mask2, one}former" This reverts commit e5371edabd301cf56079def0421a0a87df307cb0. 
* Remove dtype conversion noop * [DOCS] Add descriptive docstring to MinNewTokensLength (#25196) * Add descriptive docstring to MinNewTokensLength It addresses https://github.com/huggingface/transformers/issues/24783 * Refine the differences between `min_length` and `min_new_tokens` * Remove extra line * Remove extra arguments in generate * Add a missing space Co-authored-by: amyeroberts <[email protected]> * Run the linter * Add clarification comments --------- Co-authored-by: amyeroberts <[email protected]> * Register ModelOutput subclasses as supported torch.utils._pytree nodes (#25358) * Register ModelOutput subclasses as supported torch.utils._pytree nodes Fixes #25357 where DDP with static_graph=True does not sync gradients when calling backward() over tensors contained in ModelOutput subclasses * Add test for torch pytree ModelOutput serialization and deserialization * Fix `test_model_parallelism` (#25359) * fix * fix --------- Co-authored-by: ydshieh <[email protected]> * Add warning for missing attention mask when pad tokens are detected (#25345) * Add attention mask and pad token warning to many of the models * Remove changes under examples/research_projects These files are not maintained by HG. * Skip the warning check during torch.fx or JIT tracing * Switch ordering for the warning and input shape assignment This ordering is a little cleaner for some of the cases. * Add missing line break in one of the files * [ASR Pipeline] Clarify return timestamps (#25344) * [ASR Pipeline] Clarify return timestamps * fix indentation * fix ctc check * fix ctc error message! 
* fix test * fix other test * add new tests * final comment * MaskFormer, Mask2Former - replace einsum for tracing (#25297) * Replace einsum with ops for tracing * Fix comment * Load state in else (#25318) * Load else * New approach * Propagate * Fix `token` in example template (#25351) fix Co-authored-by: ydshieh <[email protected]> * Enable tests to run on third-party devcies (#25327) * enable unit tests to run on third-party devcies other than CUDA and CPU. * remove the modification that enabled ut on MPS * control test on third-party device by env variable * update --------- Co-authored-by: statelesshz <[email protected]> * 🌐 [i18n-KO] Translated `add_tensorflow_model.md` to Korean (#25017) * docs: ko: add_tensorflow_model.md * feat: chatgpt draft * fix: manual edits * fix: manual edits * fix: resolve suggestions * fix: manual edits * Fix `torch_job` worker(s) crashing (#25374) fix Co-authored-by: ydshieh <[email protected]> * Generate: add config-level validation (#25381) * Fix missing usage of `token` (#25382) * add missing tokens * fix --------- Co-authored-by: ydshieh <[email protected]> * Use small config for `OneFormerModelTest.test_model_with_labels` (#25383) fix Co-authored-by: ydshieh <[email protected]> * Add copied from for image processor methods (#25121) * Add copied from statements for image processors * Move out rescale and normalize to base image processor * Remove rescale and normalize from vit (post rebase) * Update docstrings and tidy up * PR comments * change version (#25387) * [DOCS] Add example for `TopPLogitsWarper` (#25361) * [DOCS] Add example for `TopPLogitsWarper` * fix typo * address review feedback * address review nits * 🌐 [i18n-KO] Translated `perf_train_cpu_many.md` to Korean (#24923) * docs: ko: perf_train_cpu_many.md * feat: chatgpt draft * fix: manual edits * fix: resolve suggestions Co-authored-by: Jungnerd <[email protected]> --------- Co-authored-by: Jungnerd <[email protected]> * 16059 - Add missing type hints for ASTModel 
[docs] Performance docs tidy up, part 1 (#23963)

* first pass at the single gpu doc
* overview: improved clarity and navigation
* WIP
* updated intro and deepspeed sections
* improved torch.compile section
* more improvements
* minor improvements
* make style
* Apply suggestions from code review
* feedback addressed
* mdx -> md
* link fix
* feedback addressed

Co-authored-by: Steven Liu
* Update docs/source/en/main_classes/quantization.md Co-authored-by: Steven Liu <[email protected]> * Update docs/source/en/main_classes/quantization.md Co-authored-by: amyeroberts <[email protected]> * add `load_in_4bit` in `from_pretrained` --------- Co-authored-by: Steven Liu <[email protected]> Co-authored-by: amyeroberts <[email protected]> * [`MPT`] Add `require_bitsandbytes` on MPT integration tests (#25201) * add `require_bitsandbytes` on MPT integration tests * add it on mpt as well * [`Detr`] Fix detr BatchNorm replacement issue (#25230) * fix detr weird issue * Update src/transformers/models/conditional_detr/modeling_conditional_detr.py Co-authored-by: Sylvain Gugger <[email protected]> * fix copies * fix copies --------- Co-authored-by: Sylvain Gugger <[email protected]> * Move rescale dtype recasting to match torchvision ToTensor (#25229) Move dtype recasting to match torchvision ToTensor * Fix set of model parallel in the Trainer when no GPUs are available (#25239) * fix get_keys_to_not_convert() to return correct modules for full precision inference (#25105) * add test for `get_keys_to_not_convert` * add minimum patch to keep mpt lm_head from 8bit quantization * add reivsion to * add pathname and line number to logging formatter in debug mode (#25203) * add pathname and lineno to logging formatter in debug mode * use TRANSFORMERS_VERBOSITY="detail" to print pathname and lineno * Add `token` arugment in example scripts (#25172) * fix * fix * fix * fix * fix * fix * fix --------- Co-authored-by: ydshieh <[email protected]> * resolving zero3 init when using accelerate config with Trainer (#25227) * resolving zero3 init when using accelerate config with Trainer * refactor * fix * fix import * Update rescale tests - cast to float after rescaling to reflect #25229 (#25259) Rescale tests - cast to float after rescaling to reflect #25229 * Fix some bugs for two stage training of deformable detr (#25045) * Update modeling_deformable_detr.py Fix bugs for two 
stage training * Update modeling_deformable_detr.py * Add test_two_stage_training to DeformableDetrModelTest --------- Co-authored-by: yupeng.jia <[email protected]> * [DOCS] Add example and modified docs of EtaLogitsWarper (#25125) * added example and modified docs for EtaLogitsWarper * make style * fixed styling issue on 544 * removed error info and added set_seed * Update src/transformers/generation/logits_process.py Co-authored-by: amyeroberts <[email protected]> * Update src/transformers/generation/logits_process.py Co-authored-by: amyeroberts <[email protected]> * updated the results --------- Co-authored-by: amyeroberts <[email protected]> * Fix return_dict_in_generate bug in InstructBlip generate function (#25246) Fix bug in InstructBlip generate function Previously, the postprocessing conducted on generated sequences in InstructBlip's generate function assumed these sequences were tensors (i.e. that `return_dict_in_generate == False`). This commit checks whether the result of the call to the wrapped language model `generate()` is a tensor, and if not attempts to postprocess the sequence attribute of the returned results object. 
* Remove `pytest_options={"rA": None}` in CI (#25263) fix Co-authored-by: ydshieh <[email protected]> * 🌐 [i18n-KO] Translated `perf_infer_gpu_many.md` to Korean (#24943) * doc: ko: perf_infer_gpu_many.mdx * feat: chatgpt draft * fix: manual edits * Update docs/source/ko/perf_infer_gpu_many.md Co-authored-by: Jungnerd <[email protected]> --------- Co-authored-by: Jungnerd <[email protected]> * recommend DeepSpeed's Argument Parsing documentation (#25268) * [MMS] Fix mms (#25267) * [MMS] Fix mms * [MMS] Fix mms * fix mms loading * Apply suggestions from code review * make style * Update tests/models/wav2vec2/test_modeling_wav2vec2.py * CI with `num_hidden_layers=2` 🚀🚀🚀 (#25266) * CI with layers=2 --------- Co-authored-by: ydshieh <[email protected]> * CI with `pytest_num_workers=8` for torch/tf jobs (#25274) n8 Co-authored-by: ydshieh <[email protected]> * Docs: Update list of `report_to` logging integrations in docstring (#25281) * Update list of logging integrations in docstring Also update type hint * Also add 'flyte' to report_to callback list * Revert 'report_to' type hint update Due to CLI breaking * Update InstructBLIP & Align values after rescale update (#25209) * Update InstructBLIP values Note: the tests are not independent. Running the test independentely produces different logits compared to running all the integration tests * Update test values after rescale update * Remove left over commented out code * Revert to previous rescaling logic * Update rescale tests * Docs: separate generate section (#25235) Separate generate doc section * Update bark doc (#25234) * add mention to optimization in Bark docs * add offload mention in docs * Apply suggestions from code review Co-authored-by: Sanchit Gandhi <[email protected]> * Update bark docs. 
* Update bark.md --------- Co-authored-by: Sanchit Gandhi <[email protected]> * add generate method to SpeechT5ForTextToSpeech (#25233) * add generate method to SpeechT5ForTextToSpeech * update speecht5forTTS docstrings * Remove defaults to None in generate docstrings Co-authored-by: Sylvain Gugger <[email protected]> --------- Co-authored-by: Sylvain Gugger <[email protected]> * Add timeout parameter to load_image function (#25184) * Add timeout parameter to load_image function. * Remove line. * Reformat code Co-authored-by: amyeroberts <[email protected]> * Add parameter to docs. --------- Co-authored-by: amyeroberts <[email protected]> * [JAX] Bump min version (#25286) * [JAX] Bump min version * make fixup * [small] llama2.md typo (#25295) `groupe` -> `grouped` * Fix typo: Roberta -> RoBERTa (#25302) * Move usage of deprecated logging.warn to logging.warning (#25310) The former spelling is deprecated and has been discouraged for a while. The latter spelling seems to be more common in this project anyway, so this change ought to be safe. 
Fixes https://github.com/huggingface/transformers/issues/25283 * Give more memory in test_disk_offload (#25315) * Generate: get generation mode as an enum (#25292) * Add offline mode for agents (#25226) * Add offline mode for agents * Disable second check too * Deal with nested configs better in base class (#25237) * Deal better with nested configs * Fixes * More fixes * Fix last test * Clean up existing configs * Remove hack in MPT Config * Update src/transformers/configuration_utils.py Co-authored-by: Younes Belkada <[email protected]> * Fix setting a nested config via dict in the kwargs * Adapt common test * Add test for nested config load with dict --------- Co-authored-by: Younes Belkada <[email protected]> * Document check copies (#25291) * Document check copies better and add tests * Include header in check for copies * Manual fixes * Try autofix * Fixes * Clean tests * Finalize doc * Remove debug print * More fixes * Make `bark` could have tiny model (#25290) * temp * update * update * update * small dim * small dim * small dim * fix * update * fix * fix * fix * fix * fix * fix * fix --------- Co-authored-by: ydshieh <[email protected]> * Document toc check and doctest check scripts (#25319) * Clean doc toc check and make doctest list better * Add to Makefile * [Whisper] Better error message for outdated generation config (#25298) * Remove jnp.DeviceArray since it is deprecated. (#24875) * Remove jnp.DeviceArray since it is deprecated. 
* Replace all instances of jnp.DeviceArray with jax.Array * Update src/transformers/models/bert/modeling_flax_bert.py --------- Co-authored-by: Sanchit Gandhi <[email protected]> * add CFG for .generate() (#24654) * 🌐 [i18n-KO] Translated `perf_infer_gpu_one.md` to Korean (#24978) * docs: ko: perf_infer_gpu_one * feat: chatgpt draft * fix: manual edits * fix: manual edits * fix: resolve suggestions Co-authored-by: Sohyun Sim <[email protected]> Co-authored-by: TaeYupNoh <[email protected]> * fix: resolve suggestions * fix: resolve suggestions Co-authored-by: Younes Belkada <[email protected]> --------- Co-authored-by: Sohyun Sim <[email protected]> Co-authored-by: TaeYupNoh <[email protected]> Co-authored-by: Younes Belkada <[email protected]> * Update TF pin in docker image (#25343) fix Co-authored-by: ydshieh <[email protected]> * Generalize CFG to allow for positive prompts (#25339) * Generalize CFG to allow for positive prompts * Add documentation, fix the correct class * Loosen output shape restrictions on GPT-style models (#25188) * Loosen output shape restrictions on GPT-style models * Use more self-explanatory variables * Revert "Use more self-explanatory variables" This reverts commit 5fd9ab39119558b7e750f61aa4a19014dccc5ed5. * Allow `trust_remote_code` in example scripts (#25248) * pytorch examples * pytorch mim no trainer * cookiecutter * flax examples * missed line in pytorch run_glue * tensorflow examples * tensorflow run_clip * tensorflow run_mlm * tensorflow run_ner * tensorflow run_clm * pytorch example from_configs * pytorch no trainer examples * Revert "tensorflow run_clip" This reverts commit 261f86ac1f1c9e05dd3fd0291e1a1f8e573781d5. 
* fix: duplicated argument * Generate: remove Marian hack (#25294) Remove Marian hack * Fix more offload edge cases (#25342) * fix * fix * fix --------- Co-authored-by: ydshieh <[email protected]> * Migrate Trainer from `Repository` to `upload_folder` (#25095) * First draft * Deal with progress bars * Update src/transformers/utils/hub.py Co-authored-by: Lucain <[email protected]> * Address review comments * Forgot one * Pin hf_hub * Add argument for push all and fix tests * Fix tests * Address review comments --------- Co-authored-by: Lucain <[email protected]> * Adding more information in help parser on train_file and validation_file (#25324) chorse: adding new doc on train and val * [DOCS] Add `NoRepeatNGramLogitsProcessor` Example for `LogitsProcessor` class (#25186) * Add Description And Example to Docstring * make style corrections * make style * Doc Style Consistent With HF * Apply make style * Modify Docstring * Edit Type in Docstring * Feedback Incorporated * Edit Docstring * make style * Post Review Changes * Review Feedback Incorporated * Styling * Formatting * make style * pep8 * Docs: Added benchmarks for `torch.compile()` for vision models (#24748) * added benchmarks for compile * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <[email protected]> * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <[email protected]> * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <[email protected]> * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <[email protected]> * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <[email protected]> * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <[email protected]> * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <[email protected]> * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <[email protected]> * Update 
docs/source/en/perf_torch_compile.md Co-authored-by: Sayak Paul <[email protected]> * Update docs/source/en/perf_torch_compile.md Co-authored-by: amyeroberts <[email protected]> * Update docs/source/en/perf_torch_compile.md Co-authored-by: amyeroberts <[email protected]> * added more models * added more models fr * added visualizations * minor fix * Update docs/source/en/perf_torch_compile.md Co-authored-by: Steven Liu <[email protected]> * Update docs/source/en/perf_torch_compile.md Co-authored-by: amyeroberts <[email protected]> * Update docs/source/en/perf_torch_compile.md Co-authored-by: amyeroberts <[email protected]> * Added links to models and put charts side by side * Added batch comparisons * Added more comparisons * Fix table * Added link to wheel * Update perf_torch_compile.md --------- Co-authored-by: Steven Liu <[email protected]> Co-authored-by: Sayak Paul <[email protected]> Co-authored-by: amyeroberts <[email protected]> * Add mask2former fp16 support (#25093) * Add mask2former fp16 support * Clear consistency/quality issues * Fix consistency/quality (2) * Add integration test for mask2former (fp16 case) * Fix code quality * Add integration test for maskformer (fp16 case) * Add integration test for oneformer (fp16 case) * Remove slow decorator from fp16 tests * Fix lint * Remove usage of full inference and value checks for fp16 * Temporarily comment slow for {mask, mask2, one}former * Add fp16 support to oneformer * Revert "Temporarily comment slow for {mask, mask2, one}former" This reverts commit e5371edabd301cf56079def0421a0a87df307cb0. 
* Remove dtype conversion noop * [DOCS] Add descriptive docstring to MinNewTokensLength (#25196) * Add descriptive docstring to MinNewTokensLength It addresses https://github.com/huggingface/transformers/issues/24783 * Refine the differences between `min_length` and `min_new_tokens` * Remove extra line * Remove extra arguments in generate * Add a missing space Co-authored-by: amyeroberts <[email protected]> * Run the linter * Add clarification comments --------- Co-authored-by: amyeroberts <[email protected]> * Register ModelOutput subclasses as supported torch.utils._pytree nodes (#25358) * Register ModelOutput subclasses as supported torch.utils._pytree nodes Fixes #25357 where DDP with static_graph=True does not sync gradients when calling backward() over tensors contained in ModelOutput subclasses * Add test for torch pytree ModelOutput serialization and deserialization * Fix `test_model_parallelism` (#25359) * fix * fix --------- Co-authored-by: ydshieh <[email protected]> * Add warning for missing attention mask when pad tokens are detected (#25345) * Add attention mask and pad token warning to many of the models * Remove changes under examples/research_projects These files are not maintained by HG. * Skip the warning check during torch.fx or JIT tracing * Switch ordering for the warning and input shape assignment This ordering is a little cleaner for some of the cases. * Add missing line break in one of the files * [ASR Pipeline] Clarify return timestamps (#25344) * [ASR Pipeline] Clarify return timestamps * fix indentation * fix ctc check * fix ctc error message! 
* fix test * fix other test * add new tests * final comment * MaskFormer, Mask2Former - replace einsum for tracing (#25297) * Replace einsum with ops for tracing * Fix comment * Load state in else (#25318) * Load else * New approach * Propagate * Fix `token` in example template (#25351) fix Co-authored-by: ydshieh <[email protected]> * Enable tests to run on third-party devcies (#25327) * enable unit tests to run on third-party devcies other than CUDA and CPU. * remove the modification that enabled ut on MPS * control test on third-party device by env variable * update --------- Co-authored-by: statelesshz <[email protected]> * 🌐 [i18n-KO] Translated `add_tensorflow_model.md` to Korean (#25017) * docs: ko: add_tensorflow_model.md * feat: chatgpt draft * fix: manual edits * fix: manual edits * fix: resolve suggestions * fix: manual edits * Fix `torch_job` worker(s) crashing (#25374) fix Co-authored-by: ydshieh <[email protected]> * Generate: add config-level validation (#25381) * Fix missing usage of `token` (#25382) * add missing tokens * fix --------- Co-authored-by: ydshieh <[email protected]> * Use small config for `OneFormerModelTest.test_model_with_labels` (#25383) fix Co-authored-by: ydshieh <[email protected]> * Add copied from for image processor methods (#25121) * Add copied from statements for image processors * Move out rescale and normalize to base image processor * Remove rescale and normalize from vit (post rebase) * Update docstrings and tidy up * PR comments * change version (#25387) * [DOCS] Add example for `TopPLogitsWarper` (#25361) * [DOCS] Add example for `TopPLogitsWarper` * fix typo * address review feedback * address review nits * 🌐 [i18n-KO] Translated `perf_train_cpu_many.md` to Korean (#24923) * docs: ko: perf_train_cpu_many.md * feat: chatgpt draft * fix: manual edits * fix: resolve suggestions Co-authored-by: Jungnerd <[email protected]> --------- Co-authored-by: Jungnerd <[email protected]> * 16059 - Add missing type hints for ASTModel 
(#25364) * 16059 - Add missing type hints for ASTModel * Add an additional type hint Co-authored-by: Matt <[email protected]> --------- Co-authored-by: Matt <[email protected]> * rm useless condition since the previous condition contains it. (#25403) * Fix path for dynamic module creation (#25402) * YOLOS - Revert default return_pixel_mask value (#25404) Revert default return_pixel_mask value * Docs: introduction to generation with LLMs (#25240) Co-authored-by: amyeroberts <[email protected]> Co-authored-by: Steven Liu <[email protected]> * Generate: length validation (#25384) * Improve training args (#25401) * enhanced tips for some training args * make style * Generate: generation config validation fixes in docs (#25405) * 16059 - Add extra type hints for AltCLIPModel (#25399) * Generate: lower severity of parameterization checks (#25407) * VQA task guide (#25244) * initial commit * semi-finished task guide draft * image link * Apply suggestions from code review Co-authored-by: Steven Liu <[email protected]> * Update docs/source/en/tasks/visual_question_answering.md Co-authored-by: NielsRogge <[email protected]> * feedback addressed * Apply suggestions from code review Co-authored-by: amyeroberts <[email protected]> * nits addressed --------- Co-authored-by: Steven Liu <[email protected]> Co-authored-by: NielsRogge <[email protected]> Co-authored-by: amyeroberts <[email protected]> * 🌐 [i18n-KO] Translated `add_new_model.md` to Korean (#24957) * docs: ko: add_new_model.md * feat: chatgpt draft * fix: manual edits * fix: change document title * fix: edit with reviewers Co-authored-by: Jungnerd <[email protected]> * fix: edit with reviewers Co-authored-by: Jungnerd <[email protected]> * fix: edit with reviewers Co-authored-by: Jungnerd <[email protected]> * fix: edit with reviewers Co-authored-by: Jungnerd <[email protected]> * fix: edit with reviewers Co-authored-by: SeongWooChoi <[email protected]> * fix: edit with reviewers Co-authored-by: SeongWooChoi <[email 
protected]> * fix: edit with reviewers Co-authored-by: SeongWooChoi <[email protected]> * fix: edit with reviewers Co-authored-by: Jungnerd <[email protected]> * fix: add anchor to header * Update docs/source/ko/add_new_model.md Co-authored-by: 이서정 <[email protected]> * Update docs/source/ko/add_new_model.md Co-authored-by: 이서정 <[email protected]> * Update docs/source/ko/add_new_model.md Co-authored-by: 이서정 <[email protected]> * fix: edit with reviews * feat: edit toctree --------- Co-authored-by: Wonhyeong Seo <[email protected]> Co-authored-by: Jungnerd <[email protected]> Co-authored-by: SeongWooChoi <[email protected]> Co-authored-by: 이서정 <[email protected]> * 🌐 [i18n-KO] Translated `model_summary.md` to Korean (#24625) * docs: ko: model_summary.md * feat: nmt and manual edit model_summary.mdx * fix: resolve suggestions Co-authored-by: Sohyun Sim <[email protected]> Co-authored-by: Wonhyeong Seo <[email protected]> * fix: resolve suggestions2 Co-authored-by: Sohyun Sim <[email protected]> --------- Co-authored-by: Sohyun Sim <[email protected]> Co-authored-by: Wonhyeong Seo <[email protected]> * Update Bark generation configs and tests (#25409) * update bark generation configs for more coherent parameter * make style * update bark hub repo * aligned sample_beam output selection with beam_search (#25375) * aligned sample_beam specs with beam_search * pull origin main * Revert "pull origin main" This reverts commit 06d356f1137bb52272e120a03636598c44449cf3. 
* update test_utils.py * fix format * remove comment --------- Co-authored-by: Shogo Fujita <[email protected]> * Enable passing number of channels when inferring data format (#25412) * Bark: flexible generation config overload (#25414) * [DINOv2] Update pooler output (#25392) Update pooler output * 🌐 [i18n-KO] Translated `philosophy.md` to Korean (#25010) * docs: ko: philosophy.md * feat: chatgpt draft * fix: manual edits * fix: resolve suggestions * Doc checks (#25408) * Document check_dummies * Type hints and doc in other files * Document check inits * Add documentation to * Address review comments * Generation: strict generation config validation at save time (#25411) * strict gen config save; Add tests * add note that the warning will be an exception in v4.34 * [WavLM] Fix Arxiv link and authors (#25415) * [WavLM] Fix Arxiv link and authors * make style * Generate: Load generation config when `device_map` is passed (#25413) * Fix rendering for `torch.compile()` docs (#25432) fix rendering * Add `examples` to tests to run when `setup.py` is modified (#25437) fix Co-authored-by: ydshieh <[email protected]> * Fix issue with ratio evaluation steps and auto find batch size (#25436) * Fully rebased solution * 500 * docs: add LLaMA-Efficient-Tuning to awesome-transformers (#25441) Co-authored-by: statelesshz <[email protected]> * GPTQ integration (#25062) * GTPQ integration * Add tests for gptq * support for more quantization model * fix style * typo * fix method * Update src/transformers/modeling_utils.py Co-authored-by: Sylvain Gugger <[email protected]> * add dataclass and fix quantization_method * fix doc * Update tests/quantization/gptq/test_gptq.py Co-authored-by: Younes Belkada <[email protected]> * Apply suggestions from code review Co-authored-by: Younes Belkada <[email protected]> * modify dataclass * add gtpqconfig import * fix typo * fix tests * remove dataset as req arg * remove tokenizer import * add offload cpu quantization test * fix check dataset * 
modify dockerfile * protect trainer * style * test for config * add more log * overwrite torch_dtype * draft doc * modify quantization_config docstring * fix class name in docstring * Apply suggestions from code review Co-authored-by: Younes Belkada <[email protected]> * more warning * fix 8bit kwargs tests * peft compatibility * remove var * fix is_gptq_quantized * remove is_gptq_quantized * fix wrap * Update src/transformers/modeling_utils.py Co-authored-by: Younes Belkada <[email protected]> * add exllama * skip test * overwrite float16 * style * fix skip test * Apply suggestions from code review Co-authored-by: Sylvain Gugger <[email protected]> * fix docsting formatting * add doc * better test --------- Co-authored-by: Sylvain Gugger <[email protected]> Co-authored-by: Younes Belkada <[email protected]> * Fix for #25437 (#25454) * fix * fix * fix * fix * fix * fix --------- Co-authored-by: ydshieh <[email protected]> * not debugged code * reference code so nothing is lost * novelty * added docstrings * fixed some relative import errors * fixed small bugs * added linear layers to bloom * removed impossible embedding method * Update src/transformers/models/bloom/desequence_graph_ids.py Co-au…
According to feedback, the performance and scalability section of the docs is difficult to navigate, and actionable advice in it is hard to find. Some topics are also missing. As part of the 2023 docs roadmap, we planned to refactor the section so that the actionable parts are easier and quicker to find.
The refactor will be split across several PRs to keep each one manageable. This is the first part, where you'll find:
Overall, I think this section should now be easier to navigate, with actionable pieces that are easier to find, a reasonable amount of explanation left in the doc, and relevant links for more information.